ABSTRACT

Transcription activator-like effector (TALE) DNA 
binding proteins show tremendous potential as mo- 
lecular tools for targeted binding to any desired DNA 
sequence. Their DNA binding domain consists of 
tandem arranged repeats, and due to this repetitive 
structure it is challenging to generate designer 
TALEs (dTALEs) with user-defined specificity. We 
present a cloning approach that facilitates the 
assembly of multiple repeat-encoding DNA frag- 
ments that translate into dTALEs with pre-defined 
DNA binding specificity. This method makes use of 
type IIS restriction enzymes in two sequential 
cut-ligase reactions to build dTALE repeat arrays. 
We employed this modular approach for generation 
of a dTALE that differentiates between two highly 
similar DNA sequences that are both targeted by 
the Xanthomonas TALE, AvrBs3. These data show 
that this modular assembly system allows rapid 
generation of highly specific TALE-type DNA 
binding domains that target binding sites of prede- 
fined length and sequence. This approach enables 
the rapid and flexible production of dTALEs for gene 
regulation and genome editing in routine and 
high-throughput applications. 

INTRODUCTION 

DNA binding domains that can be tailored to interact 
with user-defined DNA sequences are crucial tools for 
molecular biology (1). Bacterial transcription activator- 
like effector proteins (TALEs) from the bacterial 
pathogen Xanthomonas target DNA via a novel type of 
DNA binding domain that is composed of tandem-
arranged 33-35 amino acid repeat-modules, with each
repeat binding to one base (2,3). Base preferences of
individual repeats are specified by residues 12 and 13, known
as the repeat variable diresidues (RVDs), that determine 
preferential pairing with A (NI), C (HD), G (NK) and T 
(NG) nucleotides, respectively (2-5). In principle, the use 
of this TALE code facilitates the assembly of repeat arrays 
that bind to any desired DNA sequence that is preceded
by a T nucleotide (2,6,7). Recent studies have also shown 
that TALE repeats function as sequence specific targeting 
modules not only in the context of a transcription factor 
but also when fused to a FokI nuclease domain (4,8-10).
This suggests that the TALE DNA binding domain 
enables applications comparable to zinc finger (ZF) tech- 
nology (11). ZFs that are assembled into an array are 
known to influence the DNA specificity of adjacent ZFs 
(12) and, due to this context dependency, ZF arrays of 
desired DNA specificity require experimental validation. 
In contrast, there is no evidence so far that 
base-preferences of TALE repeats are context dependent 
and recent studies have demonstrated that in vitro 
assembled repeat arrays target the pre-defined nucleotide 
sequences (4,5,8,10,13). 
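
The TALE code described above can be illustrated with a minimal sketch. It assumes the one-RVD-per-base pairing stated in the text (NI for A, HD for C, NK for G, NG for T) and the requirement for a preceding T; the function name `rvds_for_target` is hypothetical and not part of the published toolkit.

```python
# RVD-to-base pairing as stated in the text: NI-A, HD-C, NK-G, NG-T.
RVD_TO_BASE = {"NI": "A", "HD": "C", "NK": "G", "NG": "T"}
BASE_TO_RVD = {base: rvd for rvd, base in RVD_TO_BASE.items()}


def rvds_for_target(target):
    """Return one RVD per base of `target`, excluding the leading T.

    `target` is the full target site including the T nucleotide that
    precedes the bases contacted by the repeats (hypothetical helper).
    """
    target = target.upper()
    if not target.startswith("T"):
        raise ValueError("target site must start with the preceding T nucleotide")
    try:
        return [BASE_TO_RVD[base] for base in target[1:]]
    except KeyError as err:
        raise ValueError(f"unsupported base: {err.args[0]!r}") from None


if __name__ == "__main__":
    print(rvds_for_target("TACGT"))
```

The sketch treats the code as strictly one-to-one; RVDs with relaxed specificity (NN, NS) are not considered here.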

A major hurdle to the routine application of the TALE 
DNA binding domain is that the assembly of genes 
encoding tandemly arranged repeats is difficult to 
achieve via standard cloning approaches. To address this 
need, we developed a rapid, efficient, low-cost approach 
for engineering TALE-type DNA binding domains with 
custom specificities, which involves fusing individual 
TALE repeats into a desired array. Functional repeat 
arrays can be cloned directly into an expression vector 
or alternatively into a Gateway-compatible entry vector 
that facilitates recombination based transfer into any 
desired Gateway-compatible expression vector. We also 
demonstrate that TALE repeat arrays, which have 
been assembled by this cloning approach, target in vivo 
the pre-defined DNA sequences with high sequence 
specificity. 
MATERIAL AND METHODS

Vector construction 

The BsaI recognition site in pUC57 was mutagenized with
primers RM1 and RM2 to obtain pUC57ΔBsaI (for
primer sequences see Supplementary Table S1).
Repeat-modules were amplified from pENTR-D-avrBs3
with primers adding BsaI recognition and restriction
sites at the 5'- and 3'-end and were blunt end cloned
into pUC57ΔBsaI (sequences of repeat modules are
provided in Supplementary Figure S1; primer sequences
are provided in Supplementary Table S1). RVD encoding
nucleotides were modified via site-directed mutagenesis to
obtain all six different RVDs (for nucleotide sequence of
each RVD see Supplementary Figure S1). pENTR-D-avrBs3
was amplified with primers RM3 and RM4
to create pENTR-D-TALE-Δrep-BsaI-AC. To create
pUC57-AB-DEST and BC-DEST, first the aadA gene
was amplified from pGWB441 (14) and cloned by blunt
end ligation into the ScaI restriction site of pUC57ΔBsaI
using RM5 and RM6. Second, the BsaI and BpiI restriction
and recognition sites were added using primer RM7
with RM8 and RM9 with RM10, respectively, and the
PCR fragment obtained was ligated to a PCR fragment
that had been amplified from pGWB441 using primers
RM11 and RM12. pENTR-D-avrBs3 was amplified
with primers RM13 and RM14 to create
pENTR-D-TALE-Δrep-BpiI-AC. The RVDs in
pENTR-D-TALE-Δrep-BpiI-AC were changed using site-directed
mutagenesis. Mutations were introduced with the Phusion
site-directed mutagenesis kit (New England Biolabs).
All BsaI and BpiI recognition sites in pGWB5 (14)
were mutagenized as described previously (15) to create
pGWB5* (for sequence see Supplementary Figure S4).
The set of plasmids that is required to generate genes
encoding dTALE repeat arrays has been deposited in the
non-profit plasmid repository Addgene
(http://www.addgene.org).

Cut-ligation cloning protocol 

For the cut-ligation reaction, 40 fmol of each plasmid, ligation
buffer (Fermentas), 15 U of either BsaI or BpiI and 15 U of
high-concentrated T4 DNA ligase were used in a 20 µl
volume. The reaction was incubated in a thermocycler
with the following program: 50 cycles of 5 min at 37°C and
5 min at 20°C, followed by 10 min at 50°C and 10 min at 80°C.
One microliter of the reaction mix was added to 50 µl of
chemically competent TOP10 cells, incubated for 15 min on ice
and transformed by heat shock. Clones were analysed by colony PCR,
restriction analysis and sequencing.
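
As a practical aid for setting up such a reaction, the following sketch converts the 40 fmol of plasmid stated above into a mass and a pipetting volume, and sums the run time of the cycling program. It assumes an average mass of 650 g/mol per base pair of double-stranded DNA, which is a common approximation and not a value taken from this study; the plasmid size and concentration in the example are made up.

```python
# Assumed average molar mass of one base pair of double-stranded DNA (g/mol).
AVG_BP_MASS_G_PER_MOL = 650.0


def ng_for_fmol(fmol, plasmid_bp):
    """Mass in ng that corresponds to `fmol` femtomoles of a plasmid."""
    grams = fmol * 1e-15 * plasmid_bp * AVG_BP_MASS_G_PER_MOL
    return grams * 1e9


def volume_ul(fmol, plasmid_bp, conc_ng_per_ul):
    """Volume in microliters needed to pipette `fmol` of a plasmid stock."""
    return ng_for_fmol(fmol, plasmid_bp) / conc_ng_per_ul


def program_minutes(cycles=50, digest_min=5, ligate_min=5,
                    final_digest_min=10, inactivation_min=10):
    """Total run time of the cut-ligation program described in the text."""
    return cycles * (digest_min + ligate_min) + final_digest_min + inactivation_min


if __name__ == "__main__":
    # Hypothetical 3000-bp plasmid at 50 ng/ul.
    print(round(ng_for_fmol(40, 3000), 1), "ng")
    print(round(volume_ul(40, 3000, 50.0), 2), "ul")
    print(program_minutes(), "min")
```

With the default arguments the program amounts to 520 min, so the cut-ligation is conveniently run overnight.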

In planta analysis 

All entry clones were transferred by LR recombination 
(Invitrogen) into the expression vector pGWB5 (14) and 
transformed into Agrobacterium tumefaciens GV3101 (16) 
for in planta analysis. Sequences and generation of all 
promoter constructs have been described previously (5). 
GUS measurements were carried out as described previ- 
ously (17). 



RESULTS 

Assembly of TALE genes by a one-step BsaI cut-ligation

Our TALE repeat assembly kit is based on type IIS re- 
striction enzymes that cleave outside of their recognition 
site and produce a 4-bp 5' overhang (15). Since recogni- 
tion and cleavage sites are spatially separated in type IIS 
restriction enzymes, proper construct design facilitates 
cleavage-mediated generation of theoretically any desired 
4-bp overhang for a given DNA substrate. Thus jigsaw
puzzle-like directional assembly of multiple DNA fragments
is feasible (18,19). Another important aspect of
type IIS-mediated cloning [TIIS-cloning; synonym:
Golden Gate cloning (15,18,19)] is that the desired
ligation products lack the recognition sites of the given 
type IIS endonuclease. Thus cleavage and ligation can 
be carried out simultaneously (cut-ligation) rather than 
sequentially. 
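
The separation of recognition and cleavage sites can be sketched in a few lines. The example assumes the generally known BsaI geometry (recognition sequence GGTCTC, cleavage 1 nt downstream on the top strand and 5 nt downstream on the bottom strand), which yields the 4-bp 5' overhang mentioned above. The sequences and the function names are hypothetical, and everything is written in top-strand notation for simplicity.

```python
BSAI_FWD = "GGTCTC"  # BsaI recognition site read on the top strand
BSAI_REV = "GAGACC"  # BsaI site on the bottom strand, as it appears on the top strand


def release_insert(region):
    """Return (left_overhang, fragment, right_overhang) in top-strand notation.

    `region` must contain a forward BsaI site upstream of the insert and a
    reverse BsaI site downstream of it, so that both recognition sites are
    cleaved off and remain with the vector backbone.
    """
    region = region.upper()
    fwd = region.find(BSAI_FWD)
    rev = region.rfind(BSAI_REV)
    if fwd == -1 or rev == -1:
        raise ValueError("expected one forward and one reverse BsaI site")
    start = fwd + len(BSAI_FWD) + 1  # skip the site and one spacer nucleotide
    end = rev - 1                    # drop the spacer nucleotide before the reverse site
    fragment = region[start:end]
    if len(fragment) < 8:
        raise ValueError("sites are too close to release an insert with two overhangs")
    return fragment[:4], fragment, fragment[-4:]


def can_ligate(upstream_region, downstream_region):
    """True if the 3' overhang of the first insert matches the 5' overhang of the second."""
    return release_insert(upstream_region)[2] == release_insert(downstream_region)[0]


if __name__ == "__main__":
    # Made-up sequences: site + spacer + overhang + body + overhang + spacer + site.
    rep1 = "AA" + BSAI_FWD + "A" + "TACC" + "TTGACC" + "GCAG" + "T" + BSAI_REV + "TT"
    rep2 = "AA" + BSAI_FWD + "A" + "GCAG" + "AAATTT" + "CTGA" + "T" + BSAI_REV + "TT"
    print(release_insert(rep1))
    print(can_ligate(rep1, rep2), can_ligate(rep2, rep1))
```

Because the released fragment no longer carries a GGTCTC site, a correctly ligated product is not re-cleaved, which is why digestion and ligation can run in the same tube.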

We aimed to establish a toolkit that enables TIIS- 
mediated assembly of genes encoding 20 or more TALE 
repeats. The length of the corresponding TALE target 
sites should facilitate specific targeting of a unique DNA 
sequence even in the context of a highly complex eukary- 
otic genome. To do so, we PCR-amplified and cloned 
DNA fragments encoding individual repeats (herein 
referred to as repeat-modules) from the TALE gene 
avrBs3 into a modified pUC57 cloning vector 
(Supplementary Figure S1). To avoid vector-derived
BsaI cleavage products that potentially interfere with the
envisaged multi-fragment ligation, we removed the BsaI
recognition site from pUC57 by site-directed mutagenesis,
yielding pUC57ΔBsaI. The cloned repeat-modules served
as initial building blocks for the envisaged repeat
assembly. The primers introduced BsaI sites at both
termini of the repeat-module in such a way that both
recognition sites are cleaved off from the repeat-module
(Figure 1a). Initially we cloned 10 distinct repeat-modules,
each producing a different combination of terminal
overhangs upon BsaI cleavage. The overhangs were designed
so that each repeat-module would ligate only to specific
repeat-modules in such a way as to generate a pre-defined
linear repeat array. The 3'-end of the first repeat-module of an
array will ligate specifically to the 5'-end of the second
repeat-module. The 3'-end of the second repeat-module
will ligate specifically to the 5'-end of the third
repeat-module and so on. Each repeat-module array
consists of: (i) a 5'-adaptor repeat-module; (ii) a variable
number of core repeat-modules; and (iii) a 3'-adaptor
repeat-module. The position of each repeat-module
within an array is defined exclusively by the given
overlap. To get a sufficient number of different overlaps
for the distinct repeat-modules, we made use of the
degeneracy of the genetic code and incorporated different
codons for identical amino acids at the repeat-module
fusion points. In addition, the fusion point between the
repeat-modules was varied (Supplementary Figure S1).
Thus not every repeat-module encodes a complete
34-amino acid repeat. However, each repeat-module
encodes one pair of RVDs and thus determines the base 
preference of the given repeat. The overlaps were designed 
in such a way that correctly assembled repeat-modules 
Figure 1. Assembly of dTALE arrays with 10 repeat-modules via BsaI cut-ligation. (a) The cut-ligation concept is shown for two representative
repeat-modules that are displayed as white boxes (rep1 and rep2). BsaI recognition sites are shown as pentagons with black arrowheads pointing to
the cleavage site. Coloured boxes represent BsaI cleavage sites and colour identity indicates ends with compatible overlaps. The line connecting both
BsaI sites represents the vector backbone. BsaI cleavage releases repeat-modules (left side) and creates distinct overlaps at the 5'- and 3'-ends on each
repeat-module. Ligation of two repeat-modules results in ligation products that lack BsaI recognition sites (far right side). By contrast, re-ligation of
repeat-modules into their donor vectors results in plasmids that still contain BsaI recognition sites. (b) Repeat-modules that are used for dTALE

(continued) 
translate into an array of tandemly arranged 34-amino
acid repeats. For each type of repeat-module, we
generated variants encoding the RVDs NI (A), HD (C),
NK (G), NN (G/A), NG (T) and NS (A/C/G/T) by
site-directed mutagenesis (Figure 1b).
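
The principle that the position of each repeat-module is defined exclusively by its overlaps can be sketched as follows. The overlap labels (A, C and numbers) follow the naming used in the figures, but the module names, the dictionary layout and the function `order_modules` are hypothetical illustrations rather than part of the published kit.

```python
def order_modules(modules, start="A", end="C"):
    """Order repeat-modules by chaining their overlaps from `start` to `end`.

    `modules` maps a module name to a (left_overlap, right_overlap) pair.
    Each left overlap must be unique, otherwise the order is ambiguous.
    """
    by_left = {}
    for name, (left, _right) in modules.items():
        if left in by_left:
            raise ValueError(f"overlap {left!r} is used by more than one module")
        by_left[left] = name

    order = []
    current = start
    while current != end:
        if current not in by_left:
            raise ValueError(f"no module with left overlap {current!r}")
        name = by_left.pop(current)  # each module is used at most once
        order.append(name)
        current = modules[name][1]
    if by_left:
        raise ValueError(f"modules not incorporated: {sorted(by_left.values())}")
    return order


if __name__ == "__main__":
    # Three made-up modules given in scrambled order.
    example = {"rep2": ("1", "2"), "rep1": ("A", "1"), "rep3": ("2", "C")}
    print(order_modules(example))
```

The input order is irrelevant, which mirrors the situation in the reaction tube where all modules are mixed in equimolar amounts.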

Next we generated a TALE gene deletion construct that
lacks the repeat array encoding region and that allows
in vitro integration of multiple repeat-modules by BsaI
cut-ligation, thereby generating a functional, full-length
designer TALE (dTALE) gene. The deletion was generated
using an avrBs3 gene that is flanked by attL sites
(pENTR-D-avrBs3). Thus an assembled dTALE gene
can be transferred by recombination into any desired
Gateway-compatible expression vector. To generate this
vector, 17 of the 17.5 repeat-modules of avrBs3 were
removed and in their place a cassette with two BsaI
recognition sites facing in opposite directions was inserted,
yielding TALE-Δrep-BsaI-AC (Figure 1c; A and C
denote two distinct BsaI-generated overlaps). The
cassette is designed in such a way that BsaI cleavage will
release a fragment containing both BsaI recognition sites
and generate overhangs that will ligate to the 5'- and
3'-adaptor repeat-modules (Figure 1c). The repeat-deficient
TALE gene (TALE-Δrep-BsaI-AC) was cloned
into a vector containing a kanamycin selection marker,
whereas all repeat-modules were cloned into vectors
encoding an ampicillin resistance. Thus recovery of
cloned repeat-modules can be easily avoided by using
kanamycin-containing medium.

Next we carried out BsaI cut-ligations with equimolar
amounts of 10 repeat-modules and the repeat-deficient
TALE gene to generate a functional dTALE gene.
Restriction and sequence analysis of the obtained
plasmids showed that 90% of the in vitro generated
dTALE genes contained the pre-defined repeat arrays
(Figure 1d). Since dTALEs with 10 repeats are unlikely
to be long enough to bind to a unique sequence within the
context of a complex genome, we carried out BsaI
cut-ligations with 20 repeat-modules.
However, many of the observed arrays had <20 repeats,
and those that contained 20 repeats did not show the
pre-defined order of repeat-modules. We anticipated that
optimization of the procedure might enable us to produce
repeat arrays containing 20 repeat-modules. However,
given the problems that we experienced with the
assembly of 20 repeat-modules, it seemed unlikely that it
would be simple to routinely produce arrays consisting of
30 or more repeat-modules by a single-step cut-ligation
with an acceptable efficiency.



Assembly of dTALE genes by two subsequent cut-ligations 

Given that 10 repeats could be assembled efficiently in a
BsaI cut-ligation (Figure 1d), we decided to generate
repeat arrays by ligation of two BsaI-generated sub-arrays
into a repeat-deficient TALE gene by a subsequent
cut-ligation. The second cut-ligation is carried out with
the type IIS enzyme BpiI that, like BsaI, produces 4-bp
overhangs. For simplicity, we refer to the individual
repeat-modules that are flanked by BsaI recognition sites
as level 1 modules and to corresponding sub-arrays as
level 2 modules. We generated two distinct level 2
modules containing 10 and 7 level 1 modules, respectively.
Both level 2 modules are assembled with mostly identical
core repeat-modules but differ in their 5'- and 3'-terminal
adaptor modules. Therefore in addition to A and C, we
defined a new B overlap and generated 5' and 3' adaptor
modules that connect both level 2 modules (Figure 2a
and b). For the assembly of these level 2 modules, we
generated two pUC57 derivatives (pUC57-AB-DEST;
pUC57-BC-DEST), herein referred to as level 2
destination vectors, in which level 1 modules can be assembled
by BsaI cut-ligation and from which level 2 modules can
be subsequently released by BpiI cleavage. In level 2
destination vectors, BsaI and BpiI sites are positioned in inverse
orientation relative to each other but create identical
overlaps (Figure 2b). The two types of level 2 modules
(AB and BC level 2 modules, see Figure 2c) produce
distinct overhangs when being released by BpiI cleavage,
which facilitates specific ligation of the two level 2
modules to each other. For the assembly of two level 2
modules into a functional dTALE gene by BpiI-mediated
cut-ligation, we generated a construct containing a
repeat-deprived TALE gene, herein referred to as level 3
destination vector. The level 3 destination vector is basically
identical to the above-described repeat-deficient TALE
gene construct (TALE-Δrep-BsaI-AC) but contains two
central BpiI instead of BsaI sites (TALE-Δrep-BpiI-AC).
Using a BpiI cut-ligation, two distinct level 2 modules are
fused into a level 3 destination vector to encode a
functional dTALE gene (Figure 2c). The level 3 destination
vector also encodes the last half repeat that defines the
C-terminal end of each TALE repeat array. In order to
also be able to select desired RVDs for this
terminal half-repeat, we generated six different level 3
destination vectors that encode the RVDs NI (A), HD (C),
NK (G), NN (G/A), NG (T) and NS (A/C/G/T).

In summary, with these materials in hand, we can 
generate any dTALE gene encoding 17.5 or 20.5 repeats 
with pre-defined RVD composition in just two subsequent 



Figure 1. Continued 

assembly are represented as boxes. 5' and 3' adaptor modules are shown in yellow. Core repeats are shown in white. The position (rep1-rep10), RVD
(NI, HD, NK, NN, NG and NS) and overlaps (boldface font) of every module are written in the box. (c) Assembly of a dTALE gene with 10.5 repeats
via BsaI cut-ligation. Adaptor and core repeats are shown as yellow and white boxes, respectively. Boldface font indicates the unique overlaps of
modules created by BsaI cleavage. Red dots indicate cloned core repeats 3-8 that are not displayed. Purple boxes represent the regions encoding the
N- and C-terminal parts of the TALE (N- and C-term) including the last half repeat (rep 10.5). Lines connecting the boxes represent the vector
backbones that mediate either ampicillin (red line) or kanamycin (blue line) resistance. The BsaI cut-ligation assembles repeat-modules 1-10 into the
TALE-Δrep-BsaI-AC vector. The assembled dTALE gene does not contain any BsaI recognition sites. Black dots indicate repeat modules of the
assembled array (3-8) that are not displayed. (d) PCR was used to analyse colonies obtained in a representative BsaI cut-ligation. PCR fragments
from 17 colonies (1-17) were separated on a 1% agarose gel and stained with ethidium bromide. The expected size of the PCR product was 1.3 kb
and is marked with an asterisk. M: GeneRuler 1 kb DNA Ladder (Fermentas).
Figure 2. Assembly of a dTALE gene with a 17.5 repeat array by two subsequent cut-ligations. If not specified, the shapes and lines are as described
in the legend of Figure 1. (a) Adaptor and core repeat-modules that are used for dTALE assembly. Distinct overlaps of repeat modules are defined in
bold font (A, B, C, 1-9). (b) Two BsaI cut-ligations facilitate assembly of two distinct level 2 modules that contain 7 and 10 repeat-modules,
respectively. BsaI and BpiI recognition sites are shown as pentagons with black arrowheads pointing to their cleavage sites. Level 2 destination
cut-ligations. Notably, the majority of naturally occurring
Xanthomonas TALEs, including the well-studied AvrBs3
protein, contain 17.5 repeats. Thus the chosen architecture
enables us to directly compare the specificity of
Xanthomonas TALEs and in vitro generated dTALEs.
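
The two-step architecture can be summarized in a short planning sketch. It assumes that a dTALE with 17.5 repeats is described by 18 RVDs (10 in the AB level 2 module, 7 in the BC level 2 module and 1 in the terminal half repeat of the level 3 destination vector). For 20.5 repeats the sketch assumes a 10 + 10 split, which is an inference and not stated in the text; the function name and the returned dictionary keys are hypothetical.

```python
ALLOWED_RVDS = {"NI", "HD", "NK", "NN", "NG", "NS"}


def plan_two_step_assembly(rvds):
    """Split an RVD list into AB and BC level 2 modules plus the half repeat.

    Accepts 18 RVDs (17.5 repeats) or 21 RVDs (20.5 repeats). The last RVD
    is assigned to the half repeat encoded by the level 3 destination vector.
    """
    unknown = [rvd for rvd in rvds if rvd not in ALLOWED_RVDS]
    if unknown:
        raise ValueError(f"RVDs not available in the kit: {unknown}")
    if len(rvds) not in (18, 21):
        raise ValueError("expected 18 RVDs (17.5 repeats) or 21 RVDs (20.5 repeats)")
    full_repeats, half_repeat = rvds[:-1], rvds[-1]
    return {
        "AB_level2": full_repeats[:10],   # first BsaI cut-ligation
        "BC_level2": full_repeats[10:],   # second BsaI cut-ligation
        "level3_half_repeat": half_repeat,  # selects the level 3 destination vector
    }


if __name__ == "__main__":
    made_up = ["NI"] * 10 + ["HD"] * 7 + ["NG"]
    plan = plan_two_step_assembly(made_up)
    print(len(plan["AB_level2"]), len(plan["BC_level2"]), plan["level3_half_repeat"])
```

Such a plan lists the level 1 modules to combine in each BsaI cut-ligation and the level 3 destination vector to use in the final BpiI cut-ligation.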

We carried out a number of BsaI cut-ligations and
observed that the desired AB level 2 modules (10
repeat-modules) and BC level 2 modules (7 repeat-modules)
were obtained in 50 and 95% of the clones, respectively
(Figure 3a). The subsequent
BpiI cut-ligations produced on average 95% clones of
correct size (Figure 3b). Sequence analysis of ~100
assembled dTALE genes that showed the correct size in
gel electrophoresis did not uncover a single mutation. This
extremely high level of sequence fidelity is most likely due
to the fact that our approach relies on sequence-validated
plasmid DNA and does not involve PCR. Thus the
generation of dTALE genes by two subsequent cut-ligations
worked with high efficiency and fidelity.

Functional analysis of dTALEs with target-optimized 
RVD composition 

To test the functionality and specificity of in vitro
generated dTALE genes, we took advantage of two
sequence-related 19-bp AvrBs3 target boxes [up-regulated
by TALE AvrBs3 (UPT AvrBs3) boxes] that are present in
the pepper Bs3 promoter (Bs3pUPT AvrBs3 box) (6,17) and
the pepper UPA20 promoter (UPA20pUPT AvrBs3 box)
(20). The 19-bp AvrBs3 target sites in Bs3 and UPA20
are mostly identical but differ in four basepairs
(Supplementary Figure S2). Given that AvrBs3 targets
both UPT boxes, we wondered if we could generate a
dTALE gene with an identical number of repeat-modules
as avrBs3 that, however, due to target-adapted design of
the encoded RVDs, would specifically activate promoters
containing the UPA20pUPT AvrBs3 box but not promoters
containing the related Bs3pUPT AvrBs3 box. We generated
the dTALE[UPA20] via two subsequent cut-ligations and
transferred it via recombination into the plant-expression
vector pGWB5 (GenBank: AB289768.2). In this in planta
expression vector, a given gene is driven by the
constitutive cauliflower mosaic virus 35S promoter (35S).

To study target specificity of dTALE[UPA20] in vivo
we made use of two previously established GUS-reporter
constructs in which the two distinct UPT AvrBs3 boxes
from the Bs3 and UPA20 promoter are embedded in
an identical promoter context (5). Agrobacterium
tumefaciens-mediated delivery of the TALE and dTALE
genes in Nicotiana benthamiana leaves showed that
dTALE[UPA20] produced GUS staining only in
combination with the promoter containing the matching
UPA20pUPT AvrBs3 box but not with the promoter
containing the highly similar Bs3pUPT AvrBs3 box. In
contrast, AvrBs3 produced GUS activity with both
promoter constructs (Figure 4). Thus the dTALE gene
produced by modular cloning was functional in vivo.
Furthermore the target-adapted RVD composition
enabled us to generate a dTALE that, in contrast to
AvrBs3, discriminated between two highly similar target
sequences.
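
Why a target-adapted RVD composition can discriminate between similar sites is illustrated by the following sketch, which counts the positions at which a repeat array disagrees with a candidate target. It uses the base preferences listed in this study for the six RVDs and treats pairing as all-or-nothing, which is a strong simplification of real binding. The sequences are made up and are not the Bs3 or UPA20 boxes.

```python
# Base preferences of the six RVDs as listed in the text.
RVD_SPECIFICITY = {
    "NI": "A", "HD": "C", "NK": "G",
    "NN": "GA", "NG": "T", "NS": "ACGT",
}


def count_mismatches(rvds, target):
    """Count positions at which `target` disagrees with the RVD array.

    `target` includes the leading T, so it must be one base longer than
    the RVD list (hypothetical helper; binary match/mismatch only).
    """
    target = target.upper()
    if len(target) != len(rvds) + 1:
        raise ValueError("target must have one base per RVD plus the leading T")
    mismatches = 0 if target[0] == "T" else 1
    for rvd, base in zip(rvds, target[1:]):
        if base not in RVD_SPECIFICITY[rvd]:
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    specific = ["HD", "NI", "NG"]   # tight specificity at every position
    ambiguous = ["NS", "NI", "NS"]  # NS accepts any base
    for site in ("TCAT", "TGAA"):   # two made-up, closely related sites
        print(site, count_mismatches(specific, site), count_mismatches(ambiguous, site))
```

In this toy case the array with NS repeats matches both sites equally well, whereas the array with tight RVDs matches only one of them.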

A modified expression vector simplifies generation of 
dTALE expression constructs 

In the above-described approach, functional analysis of
a given dTALE gene requires recombination-based
transfer from the entry into a desired expression vector.
Implementation of Gateway technology provides a high
level of flexibility since assembled dTALE genes can be
transferred into any Gateway-compatible expression
vector. However, in principle, level 2 modules that
encode repeat sub-arrays can also be assembled directly
by BpiI cut-ligation to a functional dTALE gene within the
framework of a desired expression vector. This approach
allows assembly in two rather than three steps and avoids
the rather costly Gateway cloning step. Direct TIIS-mediated
cloning requires suitable expression vectors
that must be devoid of recognition sites for the type IIS
enzyme that is used in the assembly of level 2 modules.
Inspection of the in planta expression vector pGWB5
(17 961 bp) revealed nine BpiI and two BsaI sites. We
decided to remove both BpiI and BsaI recognition sites
since this would enable us to use this expression vector in
cut-ligations with BpiI or BsaI.
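
Inspecting a vector for type IIS sites amounts to a simple sequence scan, sketched below. The sketch assumes the generally known recognition sequences GGTCTC (BsaI) and GAAGAC (BpiI), searches both strands and treats the plasmid as circular. The example sequence is made up and is not pGWB5.

```python
SITES = {"BsaI": "GGTCTC", "BpiI": "GAAGAC"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def _count_occurrences(haystack, needle):
    """Count possibly overlapping occurrences of `needle` in `haystack`."""
    count, index = 0, haystack.find(needle)
    while index != -1:
        count += 1
        index = haystack.find(needle, index + 1)
    return count


def count_type_iis_sites(plasmid, circular=True):
    """Count BsaI and BpiI recognition sites on both strands of `plasmid`."""
    plasmid = plasmid.upper()
    counts = {}
    for enzyme, site in SITES.items():
        # For a circular molecule, also cover windows that span the origin.
        searched = plasmid + plasmid[:len(site) - 1] if circular else plasmid
        counts[enzyme] = (_count_occurrences(searched, site)
                          + _count_occurrences(searched, reverse_complement(site)))
    return counts


if __name__ == "__main__":
    made_up = "AAGGTCTCTTGTCTTCAAGAAGACTT"
    print(count_type_iis_sites(made_up))
```

Neither recognition sequence is palindromic, so forward and reverse occurrences are distinct sites and can simply be added.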

To remove BpiI and BsaI recognition sites, we amplified
11 subfragments of pGWB5, each with primers that
overlap with internal BpiI or BsaI sites but contain
single nucleotide mismatches to eliminate the given
recognition sequence (Supplementary Figure S3). Each of the
generated 11 pGWB5-derived PCR-fragments contained
at its far end a BsaI recognition site that is cleaved off
upon BsaI treatment. We designed the overlaps that are
generated upon BsaI cleavage in such a way that the 11
pGWB5-derived fragments would assemble in the desired
order in a BsaI cut-ligation, yielding pGWB5*. The newly
assembled pGWB5* is identical in its functional elements
to pGWB5 but does not contain BpiI or BsaI recognition
sites. We used Gateway recombination to transfer the
repeat-deprived TALE gene from the level 3 destination
vector TALE-Δrep-BpiI-AC into pGWB5*, yielding
pGWB5*-TALE-Δrep-BpiI-AC. This pGWB5-derivative
is now a level 3 destination and in planta expression
vector that can be used for assembly of level 2 modules
in BpiI cut-ligations (Figure 5a). Regardless of whether



Figure 2. Continued 

vectors contain two pairs of BpiI and BsaI sites producing identical overlaps (A-C). The grey box represents a Gateway cassette with ccdB gene and
chloramphenicol (CmlR) resistance marker. The line connecting the BsaI or BpiI sites represents the vector backbone that mediates either ampicillin
(red line) or spectinomycin resistance (black line). (c) A BpiI cut-ligation facilitates assembly of two level 2 modules into a functional dTALE gene.
White rectangles denote the two attachment sites (attL1 and attL2) used for LR recombination. Lines connecting the boxes represent the vector
backbones that mediate either spectinomycin (black line) or kanamycin (blue line) resistance. BpiI cleavage creates overlaps in the level 3 destination
vectors (A and C) that are complementary to those at the 5'- and 3'-end of the AB and BC level 2 modules, respectively. The generated level 3
module, which encodes a functional dTALE with 17.5 repeats, does not contain BpiI recognition sites.
Figure 3. Cut-ligation efficiency for the generation of level 2 and level 3 modules. (a) Colony PCR was used to analyse colonies obtained in the cloning of
AB and BC level 2 modules. PCR fragments from 10 colonies each (1-10 [AB] and 11-20 [BC]) were separated on a 1% agarose gel and stained with
ethidium bromide. The expected sizes for the two distinct level 2 modules are 1.2 kb [AB] and 0.9 kb [BC], respectively (asterisk). M1 and M2:
GeneRuler 1 kb and 100 bp DNA ladder from Fermentas. (b) Full-length dTALE genes that were generated by BpiI cut-ligation were analysed by
StuI-AgeI double digest. Fragments obtained from 10 distinct colonies (1-10) were separated on a 1% agarose gel and stained with ethidium
bromide. The expected sizes were 2.5 and 3.5 kb (asterisks). M: GeneRuler 1 kb DNA ladder from Fermentas.



the assembly of dTALE genes is in a three-step procedure
(involving Gateway recombination into pGWB5) or a
two-step procedure (involving BpiI cut-ligation with
pGWB5*-TALE-Δrep-BpiI-AC), the resulting T-DNAs
within the given expression vectors are identical.

We assembled ~50 repeat arrays by BpiI cut-ligations in
pGWB5*-TALE-Δrep-BpiI-AC and generally tested two
clones per array. Each of these TALE arrays showed the desired
composition of repeat-modules (Figure 5b). Thus the
direct assembly of full-length dTALE genes in an expression
vector allows rapid and cost-effective cloning.
Using the two-step cloning procedure, we
assembled a dTALE gene that was identical in its RVDs
to AvrBs3. In planta analysis showed that the assembled
dTALE (dAvrBs3) was functionally indistinguishable
from the Xanthomonas AvrBs3 protein (Figure 5c). Thus
the described two-step generation of dTALE expression
constructs provides a rapid and cost-efficient approach
for the generation of dTALE genes.
Figure 4. A dTALE with optimized RVD composition discriminates
between the closely related sequences present in pepper UPA20 and
Bs3 promoter. The uidA reporter constructs under transcriptional
control of the promoters shown at left were delivered via A. tumefaciens
into N. benthamiana leaves in combination with the 35S
promoter-driven TALE genes indicated above leaf discs. GUS assays
were carried out at 40 hpi. Leaf discs were stained with
5-bromo-4-chloro-3-indolyl-β-D-glucuronic acid, cyclohexylammonium
salt (X-Gluc) to visualize activity of the GUS reporter.



DISCUSSION 

TIIS-mediated assembly of repeat-modules: a highly
flexible approach

We developed a rapid, simple and highly cost-efficient 
approach that facilitates generation of dTALE genes 
that translate into proteins with custom specificity. The 
approach involves two subsequent cut-ligations fusing in- 
dividual sequence-validated cloned repeat-modules into a 
desired array. In the present study, we fused the 
repeat-modules into the context of a transcription factor. 
However, the developed cloning approach is highly 
flexible in several aspects. For example, the designed 
repeat-modules are flexible with respect to the context 
into which they can be cloned. One could easily change 
the level 3 destination vectors to generate TALE nucleases 
of desired specificity instead of dTALE transcription 
factors with desired specificity. Our approach is also 
highly flexible with respect to the size of a repeat array. 
We have fused two repeat sub-arrays (level 2 modules) 
containing 7 and 10 repeat-modules into a functional 
dTALE gene. However, given that 10 repeat-modules
could be assembled in a cut-ligation with high efficiency,
it should be possible, by creating the appropriate adaptor 
modules, to generate repeat arrays of 30 or more repeats 
by fusing multiple level 2 modules. By using identical core 
repeat-modules for each level 2 module, the amount of 
effort needed to generate large repeat arrays is limited. 

The hierarchical modular cloning system that we
present relies on the use of the type IIS enzymes BsaI
and BpiI that are used in subsequent cut-ligations and
that produce a full-length dTALE gene. In fact the
described approach also facilitates the generation of
higher order constructs that can combine multiple
dTALEs or other functional units in one construct. For
example, terminal BsaI sites flanking the dTALE gene in
level 3 destination vectors could be used to combine
multiple level 3 modules into a desired level 4 vector.
Given that genome editing with TALE nucleases generally
requires two distinct proteins with different repeat arrays
(11), this might represent a useful extension of the current
approach.

Our assembly kit allows incorporation of repeats with 
six distinct RVDs including NK that was recently found to 



Nucleic Acids Research, 2011, Vol. 39, No. 13 5797 



[Figure 5 graphic. Recoverable panel labels: (a) AB level 2 module; BC level 2 module; level 3 destination vector (35S P, N-term, rep 17.5, C-term, GFP); level 3 module. (c) avrBs3; davrBs3.]



Figure 5. Direct assembly of two TALE repeat sub-arrays into a modified plant expression vector that lacks BpiI recognition sites. If not specified,
the shapes and lines are as described in the legend of Figure 1. (a) A BpiI cut-ligation facilitates direct assembly of two level 2 modules into an
expression vector. Blue and green rectangles represent the 35S promoter (35S P) and the C-terminal epitope tag (GFP). The blue and black lines that
connect the boxes represent the vector backbones that contain spectinomycin and kanamycin resistance markers, respectively. (b) Cut-ligation
efficiency observed in BpiI-mediated generation of dTALE gene expression constructs. Plasmids obtained in the cloning of two level 2 modules
into the level 3 destination vector were analysed by HindIII and SacI digestion. Fragments obtained from 10 colonies (1-10) were separated on a 1%
agarose gel and stained with ethidium bromide. The expected sizes of 0.8, 4.3 and 14.6 kb are marked (asterisks). M: GeneRuler 1 kb DNA Ladder
(Fermentas). (c) Functional analysis of a dTALE that was assembled via cut-ligation into the modified in planta expression vector pGWB5*. The
de novo assembled dTALE gene (davrBs3) is identical in its RVD composition to the Xanthomonas avrBs3 gene. Agrobacterium tumefaciens was
transformed with either pGWB5* containing davrBs3 or pGWB5 containing avrBs3. Dashed lines mark the inoculated areas. Using A. tumefaciens
transient transformation the two TALE genes were delivered into the leaf of a Capsicum annuum genotype that contains the Bs3 resistance gene
(ECW-30R). Two days after infiltration, the leaves were cleared in ethanol to visualize the AvrBs3-triggered and Bs3-mediated hypersensitive
response (dark areas).
interact with G bases preferentially (4,5) and NS, which
has been shown to target A, C, G and T bases with almost 
identical affinity (2). Thus our assembly kit allows gener- 
ation of dTALEs with high sequence specificity as well as 
degeneracy within defined positions of the given target 
site. 
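
The relationship between RVD composition, specificity and degeneracy can be illustrated with a minimal Python sketch. It assumes a simplified all-or-nothing binding model using only the base preferences stated in this article (NI for A, HD for C, NK for G, NG for T, and NS for any base). Real repeat-base interactions are quantitative, and the function names and the example array below are hypothetical.

```python
# Simplified, all-or-nothing RVD-to-base preferences (an assumption for
# illustration; real affinities are graded rather than binary).
RVD_BASES = {
    "NI": {"A"},
    "HD": {"C"},
    "NK": {"G"},
    "NG": {"T"},
    "NS": {"A", "C", "G", "T"},  # degenerate: accepts any base
}


def array_matches(rvds, target):
    """Return True if each repeat's RVD accepts the base at its position."""
    target = target.upper()
    if len(rvds) != len(target):
        return False
    return all(base in RVD_BASES[rvd] for rvd, base in zip(rvds, target))


def degenerate_positions(rvds):
    """Return 0-based indices of repeats whose RVD accepts more than one base."""
    return [i for i, rvd in enumerate(rvds) if len(RVD_BASES[rvd]) > 1]


# Hypothetical four-repeat array with one degenerate (NS) position.
example_array = ["HD", "NG", "NS", "NI"]
print(array_matches(example_array, "CTGA"))
print(array_matches(example_array, "ATGA"))
print(degenerate_positions(example_array))
```

In this toy model, an NS repeat tolerates any base at its position, whereas each of the other RVDs restricts its position to a single base.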

Previous studies have often made use of conserved pairs 
of restriction enzyme sites flanking the repeat region 
(e.g. BamHI, SphI) to move repeat arrays into vectors 
containing a TALE backbone (21). These conserved 
pairs of restriction sites are also present in the backbone 
sequence used in our modular assembly kit, derived from 
the Xanthomonas avrBs3 gene. Thus, constructs made via 
our approach are compatible with such existing 
vectors. 

Design of TALE repeat arrays — how to maximize 
target specificity 

Target specificity is a major issue in the generation of 
dTALE repeat arrays and is influenced by the repeat 
number and type of RVDs. The well-studied TALE 
AvrBs3, which binds to a 19-bp target sequence, contains 
three NS-type RVDs, which have been shown to target A, 
C, G and T nucleotides with almost identical affinity (2). 
We hypothesized that the target specificity of AvrBs3 could be 
improved if RVDs with ambiguous target specificity were 
replaced with RVDs with tight sequence specificity. To 
test this hypothesis, we made use of two similar 19-bp 
AvrBs3 target sequences in the pepper Bs3 and pepper 
UPA20 promoters that differ in four base pairs. We 
generated a dTALE that has the same number of 
repeats as AvrBs3 but that does not contain RVDs with 
ambiguous target specificity and that was designed to dif- 
ferentiate between the two similar AvrBs3 target sites in 
the Bs3 and UPA20 promoters. Indeed, this dTALE with 
optimized RVD composition discriminated between the 
two similar AvrBs3 target sequences and specifically 
activated the pepper UPA20 but not the Bs3 promoter. These 
data demonstrate that a target-adapted RVD composition 
facilitates generation of repeat arrays with high specificity. 
Previously, we engineered two AvrBs3 derivatives with 
four additional repeat units that target a 23-bp target 
box and that specifically activate either the pepper Bs3 or 
UPA20 promoter (5). Thus we could show that both RVD 
composition and the size of the repeat array affect target 
specificity. 
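
The design principle described above, replacing ambiguous RVDs with specific ones so that an array distinguishes two similar target sites, can be sketched in Python as follows. This is a hedged illustration only: it assumes a binary match/mismatch model, and the short sequences and arrays are hypothetical toy examples, not the actual Bs3 or UPA20 target boxes.

```python
# Simplified RVD-to-base preferences (binary model, assumed for illustration).
RVD_BASES = {
    "NI": {"A"},
    "HD": {"C"},
    "NK": {"G"},
    "NG": {"T"},
    "NS": {"A", "C", "G", "T"},  # ambiguous RVD
}


def count_mismatches(rvds, site):
    """Count positions where the RVD does not accept the base in the site."""
    if len(rvds) != len(site):
        raise ValueError("repeat array and target site must have equal length")
    return sum(base not in RVD_BASES[rvd] for rvd, base in zip(rvds, site.upper()))


def discriminates(rvds, intended_site, similar_site):
    """True if the array fully matches the intended site but not the similar one."""
    return (count_mismatches(rvds, intended_site) == 0
            and count_mismatches(rvds, similar_site) > 0)


# Hypothetical 6-bp sites that differ at two positions (indices 2 and 5).
site_a = "TACCTG"
site_b = "TATCTA"

# Array with ambiguous NS repeats at the differing positions: matches both sites.
ambiguous_array = ["NG", "NI", "NS", "HD", "NG", "NS"]
# Array with specific RVDs adapted to site_a at those positions.
adapted_array = ["NG", "NI", "HD", "HD", "NG", "NK"]

print(discriminates(ambiguous_array, site_a, site_b))
print(discriminates(adapted_array, site_a, site_b))
print(count_mismatches(adapted_array, site_b))
```

In this toy model, the array carrying NS at the two differing positions cannot tell the sites apart, whereas the adapted array matches only the intended site.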

Alternative approaches for assembly of TALE repeats 

A recent manuscript provides an alternative protocol 
for modular assembly of dTALE genes that encode 12.5 
repeats (13). The approach relies on four cloned repeat 
monomers (NI, HD, NN, NG) that are linked to 
adaptors with type IIS recognition sites via PCR, which 
facilitates assembly of a desired repeat array into a TALE 
gene deletion construct by two sequential ligations. Thus, 
this approach facilitates assembly of TALE arrays using 
four cloned repeat monomers and TALE gene deletion 
constructs. At first sight, the method is very 
attractive, because only a few gene constructs are needed 
and thus the upfront work is quite limited. On the other 
hand, the assembly procedure is rather complex and la- 
borious and involves amplification of twelve individual 
repeat-modules, subsequent PCR product purification, 
BsaI cleavage (generates distinct 4-bp overlaps for each 
module), purification of cleavage products, ligation of 
three sub-arrays (each with four repeats), gel purification 
and subsequent PCR amplification of the three repeat 
sub-arrays, subsequent PCR product purification, 
cleavage of repeat sub-arrays and vector (contains 
TALE gene deletion that lacks the repeat array), purifica- 
tion of cleavage products and finally ligation of three 
sub-arrays into the TALE gene backbone. In summary, 
the approach involves two rounds of PCRs, five purifica- 
tion steps and two ligations. Our assembly kit facilitates 
assembly of 17.5 TALE repeats in two subsequent 
single-tube cut-ligations. Importantly our approach does 
not require purification or PCR steps, instead relying on 
ligation of sequence-validated plasmid inserts. Thus 
assembly of dTALE genes with this approach is less la- 
borious and results in arrays with 17.5 instead of 12.5 
repeats. Since our approach relies on ligation of 
sequence-validated plasmids, it seems likely that the 
accuracy of these dTALE genes is higher than that of 
dTALE genes produced by ligation of amplification 
products that have undergone two subsequent 
PCRs. Although our assembly kit has many advantages, 
it should be emphasized that it required 80 gene 
constructs (72 repeat-modules, two level 2 destination 
vectors and six level 3 destination vectors) and therefore 
a substantial upfront workload. In its current state, our 
kit enables the production of many dTALE genes quickly, 
at low cost and with 
high efficiency. Since the established assembly approach 
does not involve any complex manual procedures, it could 
be carried out by a pipetting robot. This would facilitate 
implementation of TALE technology in high-throughput 
applications. 